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We present a study by linear stability analysis and large-scale Monte Carlo simulations of a sim- 
ple model of biological coevolution. Selection is provided through a reproduction probability that 
contains quenched, random interspecies interactions, while genetic variation is provided through a 
low mutation rate. Both selection and mutation act on individual organisms. Consistent with some 
current theories of macroevolutionary dynamics, the model displays intermittent, statistically self- 
similar behavior with punctuated equilibria. The probability density for the lifetimes of ecological 
communities is well approximated by a power law with exponent near —2, and the corresponding 
power spectral densities show 1/f noise (flicker noise) over several decades. The long-lived commu- 
nities (quasi-steady states) consist of a relatively small number of mutualistically interacting species, 
and they are surrounded by a "protection zone" of closely related genotypes that have a very low 
probability of invading the resident community. The extent of the protection zone affects the sta- 
bility of the community in a way analogous to the height of the free-energy barrier surrounding 
a metastable state in a physical system. Measures of biological diversity are on average station- 
ary with no discernible trends, even over our very long simulation runs of approximately 3.4 x 10^ 
generations. 

PACS numbers: 87.23.Kg 05.40.-a 05.65.-l-b 



I. INTRODUCTION 



Biological evolution offers a number of important, unsolved problems that are well suited for investigation by 
methods from statistical physics. Many of these can be studied using complex, interacting model systems far from 
equilibrium |l| . Areas that have generated exceptional interest among physicists are those of coevolution and speci- 
ation. A large class of coevolution models have been inspired by one introduced by Bak and Sneppen 0, in which 
species with different levels of "fitness" compete, and the least fit species and those that interact with it are regularly 
"mutated" and replaced by new species with different, randomly chosen fitness. Models in this class exhibit avalanches 
of extinctions and appear to evolve towards a self-organized critical state Q . Although such models may be said to 
incorporate Darwin's principle of "survival of the fittest," they are artificial in the sense that mutation and selection 
are assumed to act collectively on entire species, rather than on individual members of their populations. 

In reality, mutations are changes in the genotypes of individual organisms that are introduced or passed on during 
reproduction. These changes in the genotype affect the phenotype (physical and behavioral characteristics of the 
organism and its interactions with other organisms), and it is on the level of the phenotypes of individuals that 
competition and selection act. A number of evolution models (see, e.g., Ref. m for a review from a physicist's point 
of view) therefore take as their basis a genome in the form of a string of "letters" as in Eigen's quasi-species model 
of molecular evolution |^ . Depending on the level of the modeling, the letters of the genomic "alphabet" can be a 
(possibly large) number of alleles at a gene locus, or the four nucleotides of a DNA or RNA sequence |3- However, 
the size of the alphabet is not of great importance in principle, and it is common in models to use a binary alphabet 
with the two letters and 1 (or ±1) HHHQ. 

We believe a fruitful approach to the study of coevolution is one in which selection is provided by interspecies 
interactions along lines commonly considered in community ecology, while genetic variation is provided by random 
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mutations in the genomes of individual organisms. An early attempt in this direction is the coupled NK model 
with population dynamics introduced by Kauffman and Johnsenj|8y&| , but thus far not many similar models have 
been studied. Recently, Hall, Christensen, and collaborators [ToLlllLll2 | introduced a model they called the "tangled- 
nature" model, in which each individual lives in a dynamically evolving "fitness landscape" created by the populations 
of all the other species. Here we consider a simplified version of this model, in which no individual is allowed 
to live through more than one reproduction cycle. This restriction to nonoverlapping generations enables us to 
both study the model in detail by linear stability analysis and to perform very long Monte Carlo simulations of its 
evolutionary behavior. In short, the model consists of populations of different species, on which selection acts through 
asexual reproduction rates that depend on the populations of all the other species via a constant, random interaction 
matrix that allows mutualistic, competitive, and predator/prey relations. In addition to these direct interactions, all 
individuals interact indirectly through competition for a shared resource. The competition keeps the total population 
from diverging. Genetic variety is provided by a low mutation rate that acts on the genomes of individual organisms 
during reproduction, inducing the populations to move through genotype space. The resulting motion is intermittent: 
long periods of stasis with only minor fluctuations arc interrupted by bursts of significant change that is rapid on a 
macroscopic time scale, reminiscent of what is known in evolutionary biology as punctuated equilibria [13) llJ) ■ A 
preliminary report on some aspects of the work reported here is given in Ref . |l6| . 

The main foci of the present paper are the structure and stability of communities and the statistical properties of 
the dynamical behavior on very long ("geological") time scales. The rest of the paper is organized as follows. In Sec. ITU 
we describe our model in detail, including the detailed Monte Carlo algorithm used for our simulations. In Sec. lIIII we 
discuss the properties of the fixed points of the population for the mutation-free version of the model. Many of these 
properties can be understood analytically within a simple mean-field approach. A full, probabilistic description of our 
model is also possible, although the mathematics is somewhat involved. In Appendix 1X1 we provide, for the sake of 
completeness, the key equations of this approach. In Sec. II VI we give a detailed report on our large-scale Monte Carlo 
simulations and the numerical results, together with a discussion of their relations to the fossil record and to other 
theoretical models. Finally, in Sec. El we summarize our results and give our conclusions and suggestions for future 
studies. 



II. MODEL AND ALGORITHM 



As mentioned in Sec. IB we use a simplified version of the tangled-nature model introduced by Hall, Christensen, and 
collaborators [lOllTll IT ^ . It consists of a population of individuals with a genome of L genes, each of which can take 
one of the two values or 1. Thus, the total number of different genotypes is Afmax = 2^. We consider each different 
genotype as a separate species, and we shall in this paper use the two terms interchangeably. This is justified by the 
idea that in the relatively short genomes we can consider computationally, each binary "gene" actually represents a 
group of genes in a coarse-grained sense. 

In our version of the model, the population evolves in discrete, nonoverlapping generations (as in, e.g., many insects), 
and the number of individuals of genotype / in generation t is nj(t). The total population is Ntot(t) = '^jni{t). 
In each generation, the probability that an individual of genotype / produces a litter of F offspring before it dies is 
Pj{{nj{t)}), while the probability that it dies without offspring is 1 — Pj. Although the fecundity F could be quite 
complex in reality (e.g., a function of / on the average, but random both in time and for each individual), we here 
take a siniplistic approach and assume it is a constant, independent of / and t. The main difference from the model of 
Refs. |l0lllllll2l | is that the generations are nonoverlapping in our model, while individuals in the original model may 
live through several successive reproduction cycles. This simplification facilitates theoretical analysis and enables us to 
make significantly longer simulations than those reported in Refs. (lOLITllIT^ . In the "opposite" direction, the earlier 
model could be generalized to include non-trivial "age structures," since individuals living through several cycles can 
be assigned an age, with age-dependent survival and reproductive properties 
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As in the original model [lOj, IllL Il2| , the reproduction probability Pj is taken as 

Pi{{nj{t)}) = 1 _,_ [- J2j Mijnj{t)/Ntot{t) + 7Vtot(0/^o] ' 

Here, the Verhulst factor A'o 0| represents an environmental "carrying capacity" that might be due to limitations on 
shared resources, such as space, light, or water. It prevents the total population from indefinite growth, stabilizing it 
at O (No). The interaction matrix M expresses pair interactions between different species such that the element Mjj 
gives the effect of the population density of species J on species /. Thus, a mutualistic relationship is represented 
by both Mjj and Mjj being positive, while both being negative models a competitive relationship. If they are of 
opposite signs, we have a predator-prey situation. To concentrate attention on the effects of interspecies interactions. 



we have a p 

we follow Refs. [Tol Ull lia | in setting the self-interactions A/// = 0. The offdiagonal elements of Mjj are randomly 
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and uniformly distributed on [—1, 1] as in Ref. [TJ]. The interaction matrix is set up at the beginning of the simulation 
and is not changed later, a situation that corresponds to quenched disorder in spin-glass models 27]. We note that 
we have not attempted to give M a particularly biologically realistic form. Some possible modifications are discussed 
in Sec. |V| 

In this model there is a one-to-one correspondence between genotype (the Ith specific bit string) and phcnotype 
(the Ith row and column of M). Thus, the phcnotype specifies both how the Ith species influences the other species 
that are cither actually or only potentially present in the community (the Ith column) and how it is influenced by 
others (the Ith row). The reproduction probability Pj provides selection of the "most fit" phenotypes according to 
these "traits" (matrix elements) and the populations of the other species present in the community. The effect (or 
lack thereof) of a particular trait depends on the community in which the species exists, just as a cheetah's superior 
speed is only relevant to survival if fast-moving prey is available. 

In each generation, the genomes of the individual offspring organisms are subjected to mutation with probability 
fi/ L per gene and individual. Mutated offspring are re-assigned to their new genotypes before the start of the next 
generation. This provides the genetic variability necessary for evolution to proceed. 

The analytic form of Pi{{nj{t)}), Eq. is by no means unique. For instance, one could use a "soft dynamic" 
in which the effects of the interactions and the Verhulst factor factorize in P/. Alternatively, one could dispense 
with the exponential and use linear or bilinear relationships instead, as is most common in population biology |29|. 
Investigations of the effects of such modifications are left for the future. 

Our evolution algorithm proceeds in three layers of nested loops. 

1. Loop over generations t. 

2. Loop over the Af{t) populated genotypes / in generation t. 

3a. Loop of length n/(t) over the individuals of genotype /. Each individual produces F offspring for generation 
t+1 with probability Pi{{nj{t)}), or dies without offspring with probability 1 — Pj. In either case, no individual 
survives from generation t to generation t + I. 

3b. Loop of length equal to the total number of offspring of genotype / generated in loop 3a, attempting to mutate 
each gene of each individual offspring with probability fi/ L and moving mutated offspring to their respective 
new genotypes for generation t+1. 



III. LINEAR STABILITY ANALYSIS 



Though neither our model nor the simulations are deterministic, a number of their gross properties can be under- 
stood in terms of a mean-field approximation that ignores statistical fiuctuations and correlations. The time evolution 
of the populations is then given by the set of difference equations, 

ni{t + 1) = nj{t)FPi{{nj{t)})[l - fi] + {^,/L)F ^ (i)P^(,) ({nj(i)}) + 0{pi^) , (2) 

Kil) 

where J2k{i) i'^s over the L genotypes K{I) that differ from / by one single mutation (i.e., the Hamming distance 
[s^ Hki = 1). The corrections of 0{^^) correspond to multiple mutations in single individuals. Naturally, a full 
investigation of this equation is highly non-trivial, even in the absence of random noise. In particular, Eq. |(5J) can be 
regarded as a logistic map in a 2^-dimensional space. Logistic maps are known to admit, in general, both fixed points 
and cycles of non-trivial periods [l^. To keep both analysis of simulations and theoretical investigations manageable, 
we choose parameters in such a way that we can focus our attention on fixed points. 

Of course, what we simulate are stochasticprocesses. However, as long as the mutation rate is well below the error 
threshold for mutational melt-down P, H, H, Q, Efl US i "^^^ successful mutations can become fixed in the population 
before another successful mutant arises. This results in a separation of time scales between the ecological scale 
of fluctuations within fixed-point communities and the evolutional scale of the durations of such communities |3lj |. 
which are known as quasi-steady states (QSS) or quasi evolutionarily stable strategies (q-ESS) JLO, llj. By ignoring 
mutations, we can make some analytical progress and gain some insight into the nature and stability properties of 
these QSS. 
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A. A/^-species fixed points 



A fixed point of Eq. (0), defined by 



n*j (t + I) = n*i (t) = n 



I J 



(3) 



is cliaracterized by liaving only J\f (< 2^) non-vanishing n'j. (Henceforth, an asterisk wiU signify a quantity as a fixed- 
point value.) Following Ref. we denote a fixed point as feasible if > for all J\f values of /. Corresponding 
to the coexistence of J\f genotypes, such a point will be simply referred to as an "A/'-species fixed point." When 
mutations are ignored, Eq. Q reduces to 



ni{t + 1) ^ ni{t)FPi{{nj{t)}) . 



(4) 



Specializing to the form of Pj given by Eq. we see that all single-species fixed points are trivially "identical," 
with nj = A^jQj = Nq ln{F — 1). This somewhat unrealistic result is just a consequence of our choice of M// = and 
can be avoided by lifting this restriction. However, the absence of self-interactions places no restriction on the main 
purpose of our work - the exploration of the effects of random but time independent interspecies interactions. 

Proceeding to the N > 2 cases, the existence of an AT-species fixed point depends critically on the submatrix 
M, with matrix elements Mjj in which both / and J are among the M species in question. (We shall use the 
tilde to emphasize that a quantity corresponds to an A/'-species subspace, rather than to the full, 2^-dimcnsional 
genotype space.) In particular, if M is non-singular, then a unique fixed point exists, as we show below. On the other 
hand, singular M's may result in a variety of "degenerate" cases. We provide just two examples to illustrate the 
mathematical richness of singular M's. If all elements vanish, then the behavior in this subspace is highly degenerate, 
with the total population given again by N^^^ = Nq \n{F — 1), regardless of TV. However, the fractions of each species, 



* + / AT"* 

Pi = ni/N^^ 



(5) 



are completely undetermined, an understandable consequence of having dynamically indistinguishable species. An 
other example of a singular interaction submatrix is M = ™ > 



Q Q , in an A/" = 2 subspace. Then, Eq. will 

drive one of the species to extinction, so that no fixed point with both n/ > (stable or unstable) can exist in this 
two-dimensional subspace, and the system collapses to N = 1. For the remainder of this article, we shall study 
analytically only the dynamical behavior of interacting species with nonsingular M's. 
To proceed, we insert Eq. © into Eq. and arrive at the N equations 



FPi{{n*j}) = 1 . 



With our choice of P/, these lead to 



J 



Mijp} 



N* 



No 



-ln(F- 1) 



(6) 



(7) 



(The analysis in this section remains valid even if Mjj ^ 0, although Mjj = is the case explicitly considered 
elsewhere in this paper.) Note that the right-hand side of Eq. ((7|) is independent of /. Since M is non-singular, we 
define its inverse by W, with elements Wjj: 



Wij = (M"^) 



(8) 



Then Eq. Q can be inverted to give 



Pi 



No 



ln(F- 1) 



(9) 



The only unknown in this equation, N^^^, can now be found via the normalization condition, P/ = which leads 
to 



where 



NUNo = MF~l) + l/t, 



t = Y,wij 

ij 



(10) 

(11) 
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Putting these into Eq. ®, we have the exphcit form of the fixed-point populations: 



n} = No 



\n{F-l) + l 



. (12) 



Although Eq. 112|l appears to provide fixed-point values for any choice of control parameters, we emphasize that 
it is applicable only for a limited range of F and M. The subtlety lies in its stability properties within the A/'-species 
subspace. First, we remind the reader that, even in the case of A/" = 1, there are a variety of behaviors. Solutions may 
have a stable fixed point with either monotonic or oscillatory decay of small deviations, or they may show bifurcations, 
period doubling, and chaos In the present study, we are interested in the effects of the interspecies interactions, 
and we therefore choose to focus on systems with monotonically decaying fluctuations in the noninteracting limit. 
This means that small fluctuations about the fixed point must decay as d{t + 1)/S{t) E (0, 1). For the single-species 
and M = cases, this restriction is easily translated to the condition 2 < F < 4.5 for the fecundity. We use F = A 
in all of our simulations. 

Next, through a straightforward linear stability analysis around the A/'-species fixed point, we obtain a condition 
that represents decay of all small perturbations, i.e., all the eigenvalues of the matrix 



dni{t+l) 



dnj{t) 



(13) 



must have real parts, lying between —1 and 1. Carrying out the differentiation and using Eq. Q to simplify the 
result, we find this matrix to be of the form 1 -|- A, where 1 is the A/'-dimensional unit matrix and the elements of A 
are given by 

A/j = (i - P*i i^^iJ - HF - 1) - 2/s) . (14) 

Our criterion translates to the requirement that all the eigenvalues of A must have real parts that lie in (—2,0). 
Fixed points with this property will be called "internally stable." Unfortunately, this criterion cannot be made more 
explicit. Given F and a set of Mjj, A must be constructed using Eqs. (|5|)- (|ll|l and diagonalized. The matrix A is 
recognized as what is known in ecology as the community matrix |22j of the A/'-species fixed point for the discrete-time 
dynamic defined by Eq. 10}. 

It has been shown, both numerically |33l | and analytically |34l| . that the proportion of large, random matrices for 
which all eigenvalues have a negative real part vanishes as the proportion of nonzero matrix elements increases. This 
has been used as an argument that highly connected ecosystems are intrinsically unstable Q| , contrary to ecological 
intuition. However, the matrix that must be studied to determine the internal stability of a fixed point is the 
community matrix A, which has a complicated relationship to the interaction matrix M and should not be expected 
to have a simple element distribution. In fact, for some bilinear population dynamics models there is numerical 
evidence that most feasible fixed points are also internally stable |3^l35j|. However, the relations between connectivity 
and stability have not yet been fully clarified and are still being discussed [s^ [s^l ■ 

Of course, issues of internal stability are somewhat academic for simulations. In practice an internally unstable 
fixed point could not be observed for more than a brief time, especially since the populations would be driven away 
from such fixed points by the noise due to both the birth/death process and the mutations. 



B. Stability against other species 

Since mutations are essential for the long-time evolutionary behavior, the "external stability" properties of the 
fixed-point communities are important. Even if the population corresponds to an internally stable A/'-species fixed 
point, mutations will generate small populations of "invader" species, i.e., genotypes outside the resident A/'-species 
community. Denoting such invaders by the subscript i and linearizing Eq. I^J about the A/'-species fixed point, we see 
that the important quantity is the multiplication rate of the invader species in the limit of vanishing Ui/Ntot, 

n,{t + l) _ F ^^^^ 

l + (F-l)exp 

To obtain this result we exploited Eqs. (|10|l and (|12|l . Explicitly, the condition for stability against the invader, 
Uiit -\- \)/ni{t) < 1, reduces to the requirement that the argimrent of the exponential function in Eq. (|15|l must be 
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positive. We also note that, if Mij = for all J in the resident community, then the multiplication rate of the invader 
equals unity. In population biology the Lyapunov exponent In [ni{t + l)/ni(t)] is known as the invasion fitness of the 
mutant with respect to the resident community [sH Is^ . From Eq. p5|l it becomes clear that the success of an invader 
depends not only on its direct interactions Mij with each of the resident species, but also on the interactions between 
the resident species through the inverse interaction submatrix Wjk- 

IV. SIMULATION RESULTS 

The model described in Sec. ITTl was studied by Monte Carlo simulations with the following parameters: L = 13, 
F = 4, A^o = 2000, and /i — 10~^. The random matrix M (with zero diagonal and other elements randomly distributed 
on [—1,1]) was chosen at the beginning of each simulation run and then kept constant for the duration of the run. In 
this regime both the number of populated species M{t) and the total population Ntot{t) ~ Nq ln(F — 1) « 2200 are 
substantially smaller than the number of possible species, Afmax ^ 2^ ~ 8192. This appears biologically reasonable in 
view of the enormous number of different possible genotypes in nature. (But see further discussion in Sec lIVEl and 
Appendix 0) In a QSS, the average number of mutant offspring of species / in generation t + 1 is approximately 
IJ,ni{t). Thus, with the mutation rate used here, each of the dominant species will produce of the order of one mutant 
organism per generation. As shown in Sec. IIVBI during a QSS most of these mutants become extinct after one 
generation. However, due to the small genome size, the same mutant of a species with a large population will be 
re-generated repeatedly by mutation from the parent species. 

A high-population genotype with its "cloud" of closely related low-population mutants could be considered a quasi- 
species in the sense of Eigen, with the high-population genotype as the "wildtype" Ji, iS, iJ, 0, ^3 ■ An alternative 
interpretation of the model is therefore as one of the coevolution of quasi-species jl^ , in which the successful invasion 
of a resident community by a mutant represents a speciation event in the lineage of the parent genotype. 

A. General features 

Most of our simulations were started with a small population of 100 individuals of a randomly selected single 
genotype, corresponding to the entry into an empty ecological niche by a small group of identical individuals. How- 
ever, runs starting from a random number of small populations of different species give essentially the same results. 
Generally, the initial species are not likely to be stable against mutants, and they usually become extinct within 100 
generations. To eliminate any short initial transients, most of the quantitative analyses presented here are based on 
time series from which the first 4096 generations were removed. The simulated quantities were recorded every 16 
generations. This time resolution was chosen to be just larger than the average time it would take for the descendants 
of a single individual of a mutant genotype i to completely replace a resident genotype J of Nq = 2000 individuals 
in the case of a maximally aggressive mutant, M^j = -f 1 and Mj, = — 1. This time, which is obtained by numerical 
solution of Eq. I|15|) . is about 15 generations and represents a "minimum growth time" for the model. Sampling at 
shorter intervals would mostly add random noise to the results, while sampling on a much coarser scale could miss 
important evolutionary events. It is easy to show, by solving Eq. H15|) analytically for short times, that the growth 
time and thus the optimal sampling interval increases logarithmically with Nq. Most quantitative results in this paper 
are based on averages over 16 independent simulation runs of 2^^ — 33 554432 generations each [s^ . 

We define the diversity of the population as the number of species with significant populations, thus excluding 
small populations of mainly unsuccessful mutants of the major genotypes. Operationally we define the diversity as 
D(t) = cxp [S i{ni{t)})], where S is the information-theoretical entropy |40ll41| . 

S{{niit)})^- /^/Wlnp/(t) (16) 

{I\pi{t)>Q} 

with 

pi{t)^ni{t)/Ntot{t) . (17) 

This measure of diversity is known in ecology as the Shannon- Wiener index 42] . It is different from the definition of 
diversity as the number of populated species (known in ecology as the species richness 0) ^^^^ used in Refs. [lO|, 
EH ]. The entropy-based measure significantly reduces the noise during QSS, caused by unsuccessful mutations. 

The Shannon- Wiener diversity index is shown in Fig. Q^a) for a simulation of 10^ generations, together with the 
normalized total population, Ntot{t) /[NQ\n{F — 1)]. We see quiet periods during which D{t) is constant except for 
small fluctuations, separated by periods during which it fluctuates wildly. The total population Ntot{t) is enhanced 
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relative to its noninteracting fixed-point value during the quiet periods, while it decreases toward the vicinity of this 
value during the active periods. 

To verify that different quiet periods indeed correspond to different resident communities {ni}, we show in Fig. [3 
the genotype labels / (integers between and 2^ — 1, corresponding to the decimal representation of the genotype bit 
string) versus time for the populated species. The populations of the different species are indicated by the grayscale 
(by color online). We see that, in general, the community is completely rearranged during the active periods, so 
that the quiet periods can be identified with the QSS of the ecology. An exception is afforded by the event near 
t = 1.5 X 10^ generations in Figs.^^a) and|2a). In this instance the populations of the dominant species decrease, 
and the total population is for a brief time spread over a larger number of species. However, the dominant species 
"regain their footing," and the original resident community continues for approximately another 50 000 generations. 
This situation is reminiscent of rare events in nucleation theory , in which a fluctuation of the order of a critical 
or even a supercritical droplet nevertheless may decay back to the metastable state. 



B. Stability of communities against invaders 

To investigate the stability properties of individual communities against invaders, we chose from a particular 
simulation run of 10^ generations ten different QSS of durations longer than 20 000 generations. The genotype 
labels and fixed-point populations that characterize these QSS are given in the first four columns of Table ^ The 
population-weighted average of the Hamming distances Hjj between pairs of genotypes in the community. 



/ j>i 

and the corresponding standard deviation, 



are shown in the fifth and sixth column, respectively. Even though the community as a whole moves far through the 
2^-dimensional population space, the different QSS communities are seen to retain the property that they consist 
of relatively close relatives (which of course are all descendants of the single, initial genotype). In Sec. lIVEl we 
demonstrate that {H) and aH remain in the range shown in Tabled even during very long simulations. 

The averages and standard deviations of the offdiagonal interaction-matrix elements M/j between the community 
members are shown in the seventh and eighth column of Tabled respectively. They show that the QSS communities 
are strongly mutualistic, as was also observed for the stable states in Ref. In contrast, the matrix elements of 22 
feasible, but otherwise randomly chosen, communities l45l | were found to be approximately uniformly distributed 
on [-1,1]. 

In Fig.|21we show histograms of the multiplication rates Eq. 115f) (i.e., the exponential function of the invasion fitness) 
of mutants that differ from the resident species by one single mutation (nearest-neighbor species, minj Hij — 1), and 
by two mutations (next-nearest-neighbor species, minj Hij = 2) |4^ . Only a very small proportion of the nearest- 
neighbor mutants have a multiplication rate above unity [Fig.|3{a)]. (In our sample, this very small proportion came 
from just one of the ten QSS considered.) Among the next-nearest neighbors, on the other hand, a not insignificant 
proportion may be successful invaders [Fig.l^Jb)]. The picture for third-nearest neighbors (not shown) is essentially the 
same as for the next-nearest neighbors. We also considered the multiplication rates for neighbors of the two most long- 
lived QSS observed in our simulations, which lasted about 1.0 x 10^ and 1.4 x 10^ generations, respectively. They both 
had no nearest neighbors with multiplication rates above unity, and the proportions for second-and third-neighbors 
were significantly smaller than those included in Fig.O In both parts of Fig. O are also shown the multiplication rates 
for neighbors of the 22 randomly chosen, feasible communities introduced in the previous paragraph. In that case, 
approximately half of the neighbors at any Hamming distance have multiplication rates above unity. 

Summarizing the results from Table and Fig. 13 we see that a long-lived QSS community is characterized by 
strongly mutualistic interactions and is surrounded by a "protection zone" of closely related genotypes that are very 
unlikely to successfully invade the resident community. The evidence from the two most long-lived QSS indicates that 
the average lifetime of a QSS is positively correlated with the extent of the protection zone. By contrast, a randomly 
chosen, feasible community has an approximately uniform distribution of Mjj over [—1,1]. It also typically has no 
protection zone and so is more vulnerable to invasion than a QSS. It is clear from the fact that none of our randomly 
chosen feasible communities turned out to be a QSS that QSS are relatively rare in this model, even among feasible 
communities. We believe this may be a favorable condition for continuing evolution, as the ecology can move rapidly 
from QSS to QSS through series of unstable communities. 
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C. QSS lifetime statistics 



The protection zone surrounding a QSS community acts much hke the free-energy barrier that separates a metastable 
state in a physical system from the stable state or other metastable states. In both cases a sequence of improbable 
mutations or fluctuations is required to reach a critical state that will lead to major rearrangement of the system. It 
is thus natural to investigate in detail the lifetime statistics of individual QSS in the way common in the study of 
metastable decay ^3 ■ To this effect we started simulations from each of the ten QSS communities discussed in 
Sec. rrVRl a,nd ran until the overlap function with the initial community, 



(20) 



became less than 0.5, at which time the system was considered to have escaped from the initial state. (We note that 
the overlap function defined here is simply the cosine of the angle between two unit vectors in the space of population 
vectors {nj}.) T he p recise value of the cutoff used is not important, as long as it is low enough to exclude fluctuations 
during the QSS [43. The composition of the population at this time (the "exit community") was then recorded 
and the run terminated. Each individual initial QSS was simulated for 300 independent escapes. The means and 
standard deviations of the individual lifetime distributions are shown in the last two columns of Table ^ The mean 
lifetimes were found to range over about one order of magnitude, from approximately 10 000 to 80 000 generations. In 
most cases the individual lifetime distributions were exponential (for which the standard deviation equals the mean), 
or they had an exponential component that described the behavior in the short-time end of the range of observed 
lifetimes. In about half of the QSS studied, a tail of very long lifetimes beyond the exponential part (indicated by a 
standard deviation significantly larger than the mean) was also observed. We further found that an initial QSS does 
not always escape via the same exit community. Typically, the genotypes in the exit community that are not present 
in the initial QSS differ from those in the original community by a Hamming distance of two or three. This picture 
is quite consistent with the stability properties of the QSS discussed in Sec. IIVBI 

The range of the lifetimes of the ten QSS discussed above was limited: for a period to be identified by eye as a QSS, 
its lifetime could not be too short, while the lifetimes were bounded above by the length of the simulation run. The 
variation of a factor of ten within these limits indicates that the dynamical behavior of the system may display a very 
wide range of time scales. Another indication to this effect is provided by the details of the activity within periods that 
appear as high-activity in Figs.^a) and|3a). Such detail is shown in Figs.^b) andj^Jb). Statistically, the picture 
in these expanded figures is similar to the one seen on the larger scale, with shorter quiet periods punctuated by even 
shorter bursts of activity. These observations, which are similar to Refs. JTj, suggest statistical self-similarity at 
least over some range of time scales. Indications of self similarity have also been seen in fossil diversity records j48i | . 

The suggestion of self-similarity that emerges from Figs. ^ and |2 makes it natural to investigate the statistics of the 
durations of active periods and QSS over a wider range of time scales than that represented by the ten QSS included 
m Table ID From Fig. [H we see that a reasonable method to make this distinction is to observe the magnitude of 
the entropy changes, \dS{t)/dt\, and to consider the system to be in the active state if this quantity is above some 
suitably chosen cutoff. This cutoff can best be determined from a histogram of dS/dt, such as shown in Fig. 0Ja). 
It shows that the probability density of the entropy derivative consists of two additive parts: a near-Gaussian one 
corresponding to population fluctuations caused mostly by the birth/death process in the QSS, and a second one 
corresponding to large changes during the active periods. From this figure we chose the cutoff as 0.015, which was 
used to classify every sampled time point as either active or quiet. Normalized histograms for the durations of the 
active and quiet periods are shown in Fig.^Jb). About 97.4% of the time is spent in QSS. The acti ve p eriods are seen 
to be relatively short, and their probability density is fitted well by an exponential distribution [i^. On the other 
hand, the lifetimes of the QSS show a very broad distribution jTll | - possibly a power law with an exponent near —2 
for durations longer than about 200 generations. We note that there is some evidence of power laws with exponents 
near —2 in the distributions of several quantities extracted from the fossil record, such as the life spans of genera or 
families, and the sizes of extinctions |50ll5lll52 . l53| . However, the fossil data are sparse and extend over no more than 
one or two decades in time, and they can be fitted almost as well by exponential distributions (s^l . Although our data 
extend over a much wider range of time scales than the paleontological data, indisputable evidence of a power law 
remains to be established. Though a few QSS of the order of one million generations will certainly appear in every run 
of 2^^ generations, more definite conclusions about the statistics of such long QSS must await simulations an order of 
magnitude longer. Another feature in Fig. 2Ib) is that the distribution of QSS durations changes to a smaller slope 
below about 200 generations. A similar effect has been observed in the fossil record of lifetimes of families (5^ ,5^ . 
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D. Power spectral densities 



From the discussion in Sec. lIVOl it is clear that by using a cutoff for the intensity of some variable which is large in 
the QSS and small in the active periods (but in both cases with fluctuations of unknown distribution) , it is impossible 
to classify the periods unambiguously. For example, by increasing the cutoff one can make appear as a single QSS 
period what previously appeared as two successive, shorter QSS periods separated by a short active period. Since 
the probability of encountering an extremely large fluctuation in a QSS increases proportionally to the length of the 
QSS, this effect will affect the longest periods most severely. The problem with determining a suitable cutoff can 
be avoided by, instead of concentrating on period statistics, calculating the power spectral density (PSD) [3 
of variables such as the Shannon- Wiener diversity index or the total population. PSDs for these two quantities are 
shown in Fig. The PSD for the diversity D(t), shown in Fig. [3^a), appears inversely proportional to the frequency 
/ (f// noise or flicker noise [H^) for / > fO~'', goes through a crossover regime of about one decade in / where it is 
~ with a > 1, and then appears to return to a « f for 10~^ < / < 10~^. For / < 10~^ our data are insufficient 
to determine the PSD unambiguously, and much longer simulations would be necessary [s^ . In the PSD for iVtot(i), 
shown in Fig. El^b), the substantial population fluctuations due to the birth/death process during the QSS periods 
produce a large noise background which interferes with the interpretation at high frequencies. However, for / < f 
the behavior is also consistent with 1/f noise. 

On time scales much longer than the mean duration of an active period, the time series for the diversity can be 
approximated by constants during the QSS, interrupted by delta-function like spikes corresponding to the active 
periods (see Fig. P). In this limit, the relation between a very wide distribution of the QSS durations r, described by 
a long-time power-law dependence of the probability density p{t) ^ , and the low- frequency behavior of the PSD 
is known analytically [ssf : 



Pif) 



f-iP-'^) for /? < 2 

I/(/|ln/|2) for /3 = 2 

j-(/3-l)(3-/3) fo^ 2 < /? < 3 (21) 

|ln/| for /3 = 3 

const. for /? > 3 



Thus, the approximate 1/f behavior of the PSDs in Fig. [3 is consistent with the approximate l/r^ behavior of the 
probability density for the QSS durations, shown in Fig.^b). 

As well as power-law distributions, PSDs that go as 1//" with awl have been extracted from the fossil record 
|48|. However, such observed PSDs extend only over one to two decades in frequency (corresponding to power-law 
probability densities extending only over one or two decades in time), and more recent work indicates that the 1/f 
spectra obtained in Ref. may be artifacts of the analysis method Although 1/f noise, at least in some 

frequ ency interval, is a property of our model and is possibly also seen in other models of macroevolution 2] (but see 
Refs. pOl l6l| for conflicting opinions on its presence in the Bak-Sneppen model), whether or not it is really present 
in the fossil record remains an open question. 



E. Stationarity and effects of finite genome size 



An important issue in evolutionary biology is whether or not the evolving ecosystem is stationary in a statistical 
sense. In Fig. |Hlwe show several diversity-related measures, averaged over a moving time window and independent 
simulation runs. These quantities are: the species richness M{t), the number N^it) of genotypes with n/ > 2, the 
Shannon- Wiener index D{t), the total population Ntot(t), and the average Hamming distance (H) between genotypes 
in the community and its standard deviation an- As seen in the figure, none of these quantities show any signs of a 
long-time trend over 2^^ generations. This is consistent with fossil evidence for constant diversity |62j |. However, it 
is in disagreement with simulations of the original version of the tangled-nature model in which a slow growth 
in species richness was observed. This discrepancy might possibly result from the different forms of the interaction 
matrix used in the two studies, but it may also be due to the relatively short time series used in Ref. [lo|. 

For our model, a simple, combinatorial phase-space argument (containing neither the individual population sizes 
nor the mutation rate) indicates that the most probable number of major genotypes should indeed be limited. The 

argument goes as follows. A community of Af genotypes can be chosen in (^) different ways and is influenced by 
Af{M — f )/2 different pairs of Mjj and Mjj. We let the probability that the pair of interactions is suitable to forming 
a stable community be q (if the requirement is simply that both AIjj and Mj/ should be positive, then g = 1/4 with 
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our choice of M). Thus, a stable A^-species community can be formed in approximately 



(22) 



ways, which obeys the recursion relation 



AA+ 1 



(23) 



with ri(l) = 2^. This recursion relation can either be used to find the most probable value A^^ numerically, or one 
can easily obtain an estimate valid for 2^ ^ M: M'^ ~ L\n.2 / h\{l / q) . The value numerically obtained with g = 1/4 
is A/"'' = 6, but the close correspondence to the results shown in Fig. may be fortuitous since the dependence of 7V^ 
on q is only logarithmic. 

Another question of interest is whether the limited size of the genome leads to "revisiting" of genotypes and 
communities. The answer is affirmative and indicates the importance of further studies of the effects of the genome 
size on the long-time dynamical behavior. For genotypes the question of revisits is easy to answer, and the results 
for a single run of 2^^ generations are shown in Fig. |7| As we see, in less than 3 x 10^ generations, at least two 
individuals of every genotype have appeared simultaneously (curve labeled n/ > 1). By the end of the run, almost 
every genotype has enjoyed being a major species with n/ > 1001 at least once. Plots on a log versus linear scale (not 
shown) indicate that the curves are reasonably well approximated by exponential approaches to Aymax, indicating that 
genotypes appear to be nearly randomly visited and revisited at a constant rate dependent on nj. 

For communities, the question is more difficult. If a community is simply defined as an internally stable A/'-species 
community, then an estimate for their total number can be obtained from Vl{M) of Eq. 1)22(1 as described in Appendix IbI 
The result with the parameter values used in this paper is of the order of 10"'^^. However, unstable communities are 
also briefly visited, especially during active periods, and a more inclusive way of counting communities would be a 
coarse-graining procedure based on the overlap function defined in Eq. H20|) . The nonzero populations are recorded 
at t = 0, and the overlap with this initial community is monitored until at some i' > it falls below a suitably chosen 
cutoff Ccut- The set {n/(i')} is then recorded as a new community with t' as its starting time. The process is repeated, 
now comparing with the newly recorded community. This procedure creates a list of communities and their starting 
times. To address the issue of revisits, we next scan this list to extract those communities that represent revisits to 
a previously visited community, or "prototype community." We compare each community {n/(i)} in the original list 
with the previously found communities {n/(t)}, sequentially in order of increasing t < t. If none of the overlaps is 
greater than Ocut, {niit)} is added to the list of prototypes A4p(tp) with = t. If an overlap greater than Ocut is 
found at some t, we stop the comparisons and add this community to the list of revisits A^r(ti )- The starting time 
of this revisited community, t-^ = is associated in the list M,-(tj-) with the unique tp = t, the starting time of the 
associated prototype. As the system evolves, the number of items in each list increases monotonically. Clearly, the 
cutoffs should be sufficiently small that communities that differ only by fluctuations inside a QSS are not considered 
different, but sufficiently large to avoid counting significantly different communities as identical. From inspection of 
the overlap fluctuations in some of the QSS included in Table HJ we found that Ocut in the range of 0.90 to 0.95 is 
optimal. Estimates of the total numbers of communities (stable and unstable) that can in principle be distinguished 
by this method are obtained in Appendix IbI Depending on details of the assumptions, the estimates vary between 
10^1 and lO^s for these cutoffs. 

The results of the procedure described above with Ocut — 0.90 and 0.95 are shown in Fig. |Slfor a single run of 2^^ 
generations. The upper pair of curves in Fig.|S|^a) corresponds to Ocut = 0.95, and the lower pair to Ocut = 0.90. In 
each pair, the upper curve shows the total number of communities in the lists A^p (tp) and A^r (ir), while the lower 
curve shows just the number of prototypes in A^p (tp). For Ocut — 0.95, about 40% of the communities are seen to 
be revisits, while the proportion is about 30% for Ocut — 0.90. In both cases, the curve for the number of prototypes 
remains approximately linear, indicating that the supply of previously unvisited communities is nowhere near to being 
exhausted, even for such a long run. In view of the enormous numbers of available communities estimated above, 
this result is reasonable. QSS appear as plateaus in the curves showing the numbers of prototypes. Figure Elb) 
provides a different perspective (for Ocut = 0.95 only). Each revisit is represented by a point with as abscissa 
and tp as ordinate. Thus, every point lies strictly below the diagonal. Long-lived QSS appear as large gaps and 
horizontal segments (e.g., the one near 10^ generations). The inset shows a detail of 10^ generations near the diagonal 
around 1.65 x 10^. As we see, there are many points just below the diagonal, representing the fluctuations around the 
(many short-lived) QSS. By contrast, points far below the diagonal represent "throwbacks" to the vicinity of earlier 
prototype communities. Note that the density of points is much higher just below the diagonal, implying that a large 
portion of the revisited communities are "fluctuation related." To highlight these differences, we show the cumulative 
probability of the ratio tp/t^ in Fig. |HIc), in which the lower curve corresponds to the data in Fig. |SIb). For the 
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case of Ocut = 0.95 (lower curve), we see that about 60% of the revisits can be regarded as "fluctuation related" 
(with tp < ir), while the rest are "throwbacks." Roughly, the latter component appears to be distributed uniformly 
over all earlier times. The upper curve shows the corresponding result for Ocut — 0.90. Corresponding to using a 
more coarse-grained covering of state space, it naturally displays a larger proportion of throwbacks. Not surprisingly, 
there is no sharp distinction between these two components of the revisited communities since even a rough partition 
depends on the details of coarse graining. Nevertheless, we can conclude that the dynamics produces a steady stream 
of essentially new communities drawn from the vast supply of possibilities. 

V. SUMMARY AND CONCLUSIONS 

In this work we have studied, by linear stability analysis and large-scale Monte Carlo simulations, a simplified version 
of the tangled-nature model of biological coevolution, recently introduced by Hall, Christensen, and collaborators 
Ull [T3 |. Selection is provided by interspecies interactions through the reproduction probability Pj [Eq. (^], 
which corresponds to a nonlinear population-dynamics model of the community ecology, while the genetic variability 
necessary for evolution is provided by a low rate of mutations that act on individual organisms during reproduction. 

At the low mutation rate studied here, the model provides an intermittent, statistically self-similar behavior, 
characterized by periods of relative calm, interrupted by bursts of rapid turnover in genotype space. During the 
quiet periods, or quasi-steady states (QSS), the population consists of a community of a relatively small number of 
mutualistically interacting genotypes. The populations of the individual genotypes, ?T./(i), fluctuate near a stable fixed 
point of a deterministic mean-fleld, mutation- free version of the model [Eq. Q]. During the active periods, the system 
moves through genotype space at a rate that is rapid on a macroscopic ("geological") time scale, although of course 
finite on a microscopic ("ecological") scale. These periods of rapid change are characterized by large fiuctuations 
in the diversity and an overall reduction of the total population. In long simulation runs of 2^^ generations, the 
ecosystem spends about 97.4% of the time in QSS. The time series produced by the model are statistically stationary, 
and there is no evidence that any particular quantity is being optimized as the system moves through genotype space. 
In that sense, the dynamical behavior can probably best be described as a neutral drift Overall, the dynamical 
behavior of this model resembles closely the punctuated-equilibria mode of evolution, proposed by Gould and Eldredge 

Elliip. 

Investigation of duration statistics for the quiet periods shows a very wide distribution with a power-law like long- 
time behavior characterized by an exponent near —2. Consistent with this result, the power spectral densit y of the 
diversity shows 1// noise. While there are claims that similar statistics characterize the fossil record pMl50ll5lll53 . l53| . 
this is still a contested issue ^,_59J. At best, observations of power-law distributions and 1// noise in the fossil record 
extend over no more than one or two decades in time or frequency, and it must remain an open question whether this 
is the optimal interpretation of the scarce data available. 

Due to the absence of sexual reproduction, our model can at best be applied to the evolution of asexual, haploid 
organisms such as bacteria. It should also be noted that no specific, biologically relevant information has been included 
in the interaction matrix M. In particular, this fact may be responsible for our QSS being strongly dominated by 
mutualistic relationships. The absence of biologically motivated detail in M is both a strength and a weakness of the 
model. Its strength lies in reinforcing the notion of universality in macroevolution models, e.g., power law behaviors 
and 1// noise. By the same token, its weakness lies in the lack of biological detail, to the point that comparison with 
specific observational or experimental data is difficult. Clearly, the detailed effects of interspecies interactions on the 
macroevolutionary behavior in models similar to the one studied here represents an important field of future research. 
Examples include the importance of the connectivity of the interaction matrix, correlated interspecies interactions 
|63). and interaction structures corresponding to food webs with distinct trophic levels. 

Despite all these caveats, we find it encouraging that such a simple model of coevolution with individual-based 
dynamics can produce punctuated equilibria, power-law distributions, and 1// noise consistent with current theories 
of biological macroevolution. We believe future research should proceed in the direction pointed out by this and 
similar models. This entails combining stochastic models from community ecology with models of mutations and 
sexual reproduction at the level of individual organisms, and investigating the consequences of more biologically 
realistic interspecies interactions. 
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APPENDIX A: MASTER EQUATION 



A complete description of the stochastic process can be given in terms of a master equation, which specifies the evo- 
lution of V {n, t), the probability that the system is found with composition n at time t. Here, n = {m, n2, n^^^^^}, 
and A/'max = 2^ is the number of individuals of species /. In our case, L = 13 and A/'max = 8192. Similar to the main 
text, we define Atot = J2i "-f ^"^^ -^o be the carrying capacity. 

We write the probability for an individual of species / to survive to reproduce as 



Pi{n) = ll + exp 



A^o 



J 



(Al) 



The main difference between this expression and Eq. |^ lies in the interpretation. Here, n is a "coordinate variable" 
in the A/'max-dimensional space, in contrast to nj (t) being just a point in this space. 

To proceed, we define the symbol [^^^] as the rate for survival (from nj individuals to mj "mothers"). Since each 
individual is given a chance to survive according to Pj, we have 



71/ 



nil 



mil (ni — m/)! 



{Pir' (1 - p,)"^-™^ 



(A2) 



which has the form of a binomial distribution. Next, each mother gives rise to F offspring. However, due to mutations, 
not every offspring is of the same genotype as the mother. Although it is possible to have mutants with a genotype 
differing from the mother by more than one bit, we restrict ourselves here to a simpler version, namely mutant 
genotypes that can differ only by one bit. Since our simulations typically involve ^ ~ 10""^, this restriction should 
not lead to serious difficulties. Given that only one bit may be flipped, there are L + I possible varieties of offspring 
for each maternal genotype. We introduce the notation 

bjfl for the number of offspring from mother J with no mutations 
bj^k for the number of offspring from mother J with the fcth bit flipped 

We now define the multinomial-like symbol 



Fmj 



jFrnj)! 
{bj.o)l 



which is the probability that the Fmj offspring are distributed into the specific set {6j_o,6j,i, 
ingredient needed is the connection matrix 



J,k 



1 if genotype G is J with the fcth bit fiipped 
otherwise 



so that the number of offspring born into species G due to mutations is 

6g = ^ ^c'^bj^k ■ 

J,k 



The final result is 



P(n',i+1)= J2 I[s{^G-bG,o-bG)Y[ 

n,m,{b} G J 



Fmj 



n 



ni 
mi 



V (n, t) 



(A3) 
The last 

(A4) 
(A5) 

(A6) 
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In principle, once the dynamics of V is found, the time dependence of various quantities (e.g., averages and 
correlations) can be computed. For instance, 

{ni)^ = Y,niV{n,t) (A7) 

n 

is the average number of individuals of species / at time t. From this very detailed description, we see that the 
evolution of various quantities can be derived. However, these evolution equations are extremely complex in general 
and progress is usually possible only by making a "mean-field" approximation, in which all correlations are neglected. 
Thus, averages of products of n are replaced by the products of the averages. As a result, a non-linear evolution 
equation for (n)^ results. In a subsequent paper we shall show that Eq. ^ is precisely the result of such 
an approach (with (n)^ denoted by {n/(t)}). We shall also demonstrate that, for a QSS, V {n,t oo) is well 
approximated by a Gaussian (for fi,l/No ^ 1), the center of which is just the fixed point given by Eq. H12I) . The 
width of this Gaussian is governed in part by the community matrix A of Eq. 114() . In addition to this (systematic, 
"drift") part, there is a part governing the noise correlations. Between the two, we can compute distributions of step 
sizes (n' — n). For a long-lived QSS, it is easy to compile data so that quantitative comparisons with these predictions 
can be made. 



APPENDIX B: COUNTING COMMUNITIES 



In this Appendix we obtain estimates for the numbers of different communities that could in principle be observed 
in an infinitely long simulation. Even our most conservative estimate is vastly larger than the numbers observed in a 
simulation run of 2^^ generations in Sec. IIV El 

If a community is simply defined as an internally stable A/'-species community, then an estimate for their total 
number can be obtained from il{J\f) of Eq. H22I) . Since il.{J\f) is quite sharply peaked around A/"^, we can estimate 

that Ea/-^(-^) - ^{J^^) ~ 2^^^g^^(^^-i)/2 « (2"/^)"^' with a = ln2/ln(l/g). For q = 1/4 and L = 13, this 
gives r2(A/'^) w 5.2 x 10^^, while direct summation of ^l{J\f) with the same parameters gives J2j\f^i-^) — ^-^ ^ 1^^^ 
with 46% due to the contribution from A/" = A/"^ =6 and convergence by A/" = 11. As it does not include unstable 
communities, this would serve as our most conservative estimate of the total number of communities that could be 
visited in an infinitely long simulation. 

To estimate the total number of different communities that can be distinguished by the overlap cutoff method 
described in Sec. IIV El we note that the overlap threshold Ocut is simply the cosine of the angle between two vectors 
in population space, Ocut = cos9. For convenience, we define e = 1 — Ocut: such that e is small (0.1 and 0.05). Then, 
9 ~ arccos(l — e) sa \/2e. The fact that the number of populated species at any one time is near J\f\ means that 
the direction of the population vector {n/} to a reasonable approximation lies within the first hyper-octant of one of 
("^t') A/"^ -dimensional hyperspheres. The total number of communities that can be distinguished with a cutoff Ocut 
is the ratio of the total A/"^ -dimensional solid angle covered by the hyper-octants to that of a hyper-cone subtended 
by 9. The solid angle of the hyper-octants is 

where Sd — 2tt'^/'^ /T{d/2) is the surface area of a d-dimensional unit sphere. The solid angle of the hyper-cone is 

for small e. Using Stirling's approximation for the gamma functions in Sd, ^(z) ~ (z/e)^-y/27r/z and the identity 
lim„^oo (l — — )" = e^^, we estimate the total number of different communities to be 



^'^'^ ^ Ut j (^^ • 

This estimate is reasonable only as long as e is large enough to exclude minor fluctuations within QSS. Observation of 
overlap fluctuations in some of the QSS included in Table indicates that Ocut in the range of 0.90 to 0.95 is optimal. 
With the parameters used in this work, Eq. yields M ~ lO^^ for Ocut = 0.90 and M ~ 10^2 for 0.95. If, instead 
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of using Af^ = 6 we assume that the number of genotypes at any time is near 8, as indicated by the top curve in 
Fig. El these estimates are increased by six orders of magnitude. 
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TABLE I: Composition and lifetime statistics of ten QSS that lasted at least 20 000 generations in a particular simulation 
run of 10^ generations. The QSS are listed in order of increasing mean lifetime. Columns one through four give the genotype 
labels and, in parentheses, the initial populations [the fixed-point populations given by Eq. (1121 1 for the A/" genotypes in each 
QSS. Columns five and six give the population-weighted mean and standard deviation of the A/" (A/" — l)/2 Hamming distances 
between the genotypes in the initial community, Eqs. 11811 and 119II . respectively. Columns seven and eight give the mean 
and standard deviations of the A/" (A/" — 1) offdiagonal interaction matrix elements Mij between the genotypes in the initial 
community, respectively. Columns nine and ten give the mean and standard deviations of the lifetimes, obtained from 300 
independent escapes for each QSS. See discussion in Sees. lIV'^ and lTV CI 
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FIG. 1; (Color online.) Time series from a simulation of 10® generations. The model parameters, which are the same in the 
subsequent figures, are: mutation rate ^ — 10~^ per individual, carrying capacity A^o = 2000, fecundity F — i, and genome 
length L = 13. Top curve (black): Shannon- Wiener diversity index D{t) = exp[S{{ni (t)})]. Bottom curve (gray, red online): 
normalized total population A'tot(t)/[A^o ln(_F — 1)]. (a) The whole time series, showing intervals of quasi-steady states (QSS) 
separated by periods of high activity, (b) Part of one of the periods of high activity, shown on a time scale expanded twenty 
times. Several shorter QSS are resolved between bursts of high activity. Comparison with (a) suggests statistical self-similarity. 
See discussion in Sees. llV'Xl and llV CI 
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FIG. 2: (Color online.) Genotype label I versus time for the same simulation run shown in Fig. In order of decreasing 
darkness from black to very light gray the symbols indicate ni > 1001, ni € [101, 1000], ni G [11, 100], rn G [2, 10], and ni = 1. 
(Online the colors are black, blue, red, green, and yellow in the same order.) Note that the difference between the label for two 
species bears no simple relation to their Hamming distance. Each QSS is composed of a different set of species, punctuated by 
periods during which the population moves rapidly through genotype space, (a) Corresponding to Fig. H[a) , sampled every 960 
generations to facilitate plotting, (b) Corresponding to Fig. ^b), sampled every 48 generations. See discussion in Sees. IIV'XI 
and lTVCl 
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FIG. 3: (Color online.) Histograms of the multiplication rate (exponential of the invasion fitness) for mutant species i against 
each of ten specific QSS (thick, black lines), compared with the same quantity against randomly chosen, feasible communities 
[thin, gray lines (green on line)]. The multiplication rates are calculated from Eg. 1151 . (a) When the mutant species differ 
from the resident community by a Hamming distance of one (nearest neighbors |4Qj). (b) When the mutant species differ from 
the resident community by a Hamming distance of two (next-nearest neighbors). See discussion in Sec. IIV 1^ 




FIG. 4: (Color online.) (a) Normalized histogram of entropy changes, averaged over 16 generations (thick curve with shading). 
Based on 16 independent runs of 2^^ = 33 554 432 generations each. The near-Gaussian central peak corresponds to the QSS, 
while the near-exponential wings correspond to the active periods. Also shown by a thin line is a histogram based on birth/death 
fluctuations in ten specific QSS communities with zero mutation rate. The latter is re-normalized so that the maxima of the 
two histograms coincide. Based on these histograms, a cutofi^ of [S{t) — S{t — 16)]/16 = 0.015 was used to distinguish between 
active periods and QSS. (b) Normalized histograms for the length of active periods (-I-) and QSS [solid, light gray circles (cyan 
online) and solid, dark gray squares (red online)]. Based on 16 independent runs of 2^^ generations each. Two of the histograms 
(4- and circles) use a constant bin width of 16 generations. In order to capture the information for large durations, the data 
for the QSS were also analyzed with exponentially increasing bin size (squares). Error bars showing standard error based on 
the spread between the individual runs are shown only where they are larger than the symbol size. The black curve through 
the points for the active periods is a least-squares fit of an exponential distribution to the data. The straight line is a guide to 
the eye, corresponding to l/x^ behavior for the QSS data. See Sec. IIV (Jl for details. 
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FIG. 5: (Color online.) Power spectral densities (PSD) (54^| for simulations of length 2^^ generations, sampled every 16 
generations and averaged over 16 independent runs. Black curve: no variance reduction. Gray curve (red online): 16-fold 
variance reduction. Error bars shown for the eight lowest frequencies in each curve are standard errors, based on the spread 
between the individual runs. The straight line is a guide to the eye, corresponding to 1// behavior, (a) Shannon- Wiener 
diversity index D{t) — exp[S{{ni{t)})]. (b) Normalized total population Ntot{t) /[No \n{F — 1)]. See discussion in Sec. llV i31 
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FIG. 6: (Color online.) Combined time-window and ensemble averages of various diversity-related measures. Each data point 
represents an average over a time window of 2^^ — 524 288 generations and over 16 (circles) or seven (triangles) independent 
runs of 2^® generations each. Shown are, from above to below, the total number of species A/'(t), the number M2{t) of species 
with ni > 2, the Shannon- Wiener index D{t), the average Hamming distance (H) between genotypes in the community, the 
normalized total population Ntot{t)/[Noln{F — 1)], and the standard deviation cth of the Hamming distances. These data 
indicate the absence of any systematic long-time trends in the dynamical behavior. See discussion in Sec. II V El 
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FIG. 7: (Color online.) The number of different genotypes visited at different population levels, shown vs time for the first 
half of a single run of 2^^ generations. From above to below, ni > 2, n/ > 11, m > 101, and ni > 1001. Horizontal plateaus 
correspond to QSS. See discussion in Sec. nvEi 
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FIG. 8: (Color online.) (a) Number of communities visited, shown vs time for a single run of 2^^ generations. A new community 
is counted whenever the overlap function falls below Om^- The upper pair of curves corresponds to Ocut = 0.95, and the lower 
pair to 0.90. The top curve in each pair (thin, gray curve, red online) counts both prototype communities and revisits, while 
the bottom curve (heavy, black curve) excludes revisits, (b) For Ocut = 0.95 only, the abscissa tr gives the starting time of each 
revisit, while the ordinate gives the starting time of the associated prototype community. Inset: detail for t-p ^ t-i over 10^ 
generations, (c) Lower curve: cumulative histogram for fp/fr from part (b). About 60 % of the revisits are to other recently 
visited communities, while the rest are approximately uniformly distributed over all previously visited communities. Upper 
curve: corresponding result for Ocut = 0.90. See discussion in Sec. llV E!I 
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